Nyström Method with Kernel K-means++ Samples as Landmarks

نویسندگان

  • Dino Oglic
  • Thomas Gärtner
چکیده

We investigate, theoretically and empirically, the effectiveness of kernel K-means++ samples as landmarks in the Nyström method for low-rank approximation of kernel matrices. Previous empirical studies (Zhang et al., 2008; Kumar et al., 2012) observe that the landmarks obtained using (kernel) K-means clustering define a good lowrank approximation of kernel matrices. However, the existing work does not provide a theoretical guarantee on the approximation error for this approach to landmark selection. We close this gap and provide the first bound on the approximation error of the Nyström method with kernel Kmeans++ samples as landmarks. Moreover, for the frequently used Gaussian kernel we provide a theoretically sound motivation for performing Lloyd refinements of kernel K-means++ landmarks in the instance space. We substantiate our theoretical results empirically by comparing the approach to several state-of-the-art algorithms.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds

Kernel k-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear k-means clustering algorithm. However, kernel kmeans clustering is computationally expensive when the non-linear feature map is highdimensional and there are many input points. Kernel approximation, e.g., the Nyström method, has been applied in previous works to approx...

متن کامل

Fast DPP Sampling for Nystrom with Application to Kernel Methods

The Nyström method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nyström using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarante...

متن کامل

Randomized Clustered Nystrom for Large-Scale Kernel Machines

The Nyström method has been popular for generating the low-rank approximation of kernel matrices that arise in many machine learning problems. The approximation quality of the Nyström method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we present a novel algorithm to compute the optimal Nyström low-approximation when the number of landm...

متن کامل

The Variational Nystrom method for large-scale spectral problems

Spectral methods for dimensionality reduction and clustering require solving an eigenproblem defined by a sparse affinity matrix. When this matrix is large, one seeks an approximate solution. The standard way to do this is the Nyström method, which first solves a small eigenproblem considering only a subset of landmark points, and then applies an out-of-sample formula to extrapolate the solutio...

متن کامل

Density-Weighted Nyström Method for Computing Large Kernel Eigensystems

The Nyström method is a well-known sampling-based technique for approximating the eigensystem of large kernel matrices. However, the chosen samples in the Nyström method are all assumed to be of equal importance, which deviates from the integral equation that defines the kernel eigenfunctions. Motivated by this observation, we extend the Nyström method to a more general, density-weighted versio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017